317 research outputs found

    An English-Italian MWE dictionary

    The translation of multiword expressions (MWEs) requires knowledge of the correct equivalent in the target language, which is hardly ever the result of a literal translation. This paper is based on the assumption that the proper treatment of MWEs in Natural Language Processing (NLP) applications, and in particular in Machine Translation and translation technologies, calls for a computational approach which must be, at least partially, knowledge-based, and in particular should be grounded on an explicit linguistic description of MWEs, using both an electronic dictionary and a set of rules. The hypothesis is that a linguistic approach can complement probabilistic methodologies to help identify and translate MWEs correctly, since hand-crafted and linguistically motivated resources, in the form of electronic dictionaries and local grammars, obtain accurate and reliable results for NLP purposes. The methodology adopted for this research is based on (i) NooJ, an NLP environment which allows the development and testing of linguistic resources, (ii) an electronic English-Italian MWE dictionary, and (iii) a set of local grammars. The dictionary mainly consists of English phrasal verbs, support verb constructions, idiomatic expressions and collocations together with their Italian translations, and contains different types of MWE POS patterns.

    Translator's knowledge in the cloud: the new translation technologies

    After machine translation, the translation technology market is witnessing a new revolution. Relevant changes are taking place under emerging phenomena of the Web such as crowdsourcing (i.e., the use of a community or group of people to perform tasks normally performed by employees) and cloud computing technologies, which enable ubiquitous access to digital content and online multilingual translation tools. In particular, crowdsourcing is being combined on a large scale with cloud models of automatic and assisted translation, together with the availability of tools shared in translation environments. This contribution analyzes the impact of the new collaborative translation technologies on the translation process and on the working practices of translators, highlighting the possible implications for translation teaching.

    MWE processing in Machine Translation: State of the Art and Open Challenges

    The poster describes the state of the art and the open challenges in MWE processing in Machine Translation.

    A knowledge-based approach to multiwords processing in machine translation: the English-Italian dictionary of multiwords

    This poster presents a knowledge-based approach to the identification and translation of multiword expressions (MWEs) from English to Italian. The main assumption of the proposed methodology is that the proper treatment of MWEs in MT calls for a computational approach which must be, at least partially, knowledge-based, and in particular should be grounded on an explicit linguistic description of MWEs, using both a dictionary and a set of rules. Empirical approaches bring interesting, complementary, robustness-oriented solutions, but taken alone they can hardly cope with this complex linguistic phenomenon, for various reasons. For instance, statistical approaches fail to identify and process low-frequency MWEs in texts or, conversely, are unable to recognise strings of words as single units of meaning even when they are very frequent. Furthermore, MWEs change continuously both in number and in internal structure, with idiosyncratic morphological, syntactic, semantic, pragmatic and translational behaviours. The hypothesis is that a linguistic approach can complement probabilistic methodologies to help identify and translate MWEs correctly, since hand-crafted and linguistically motivated resources, in the form of electronic dictionaries and local grammars, obtain accurate and reliable results for NLP purposes.
    The methodology adopted for this research work is mainly based on the following elements:
    • an NLP environment which allows the development and testing of the linguistic resources;
    • an electronic E-I MWE dictionary, based on an accurate linguistic description that accounts for different types of MWEs and their semantic properties by means of well-defined steps: identification, interpretation, disambiguation and finally application;
    • a set of local grammars.
    We will provide details about the methodology that can be applied to the identification and translation of MWEs.
    1. NooJ: an NLP environment for the development and testing of MWE linguistic resources
    NooJ is a freeware linguistic-engineering development platform used to develop large-coverage formalised descriptions of natural languages and apply them to large corpora in real time. The knowledge bases used by this tool are electronic dictionaries (simple words, MWEs and frozen expressions) and grammars, represented by organised sets of graphs, that formalise various linguistic aspects such as semi-frozen phenomena (local grammars), syntax (grammars for phrases and full sentences) and semantics (named entity recognition, transformational analysis). NooJ's linguistic engine includes several computational devices used both to formalise linguistic phenomena and to parse texts, such as finite-state transducers (FSTs), finite-state automata (FSAs), Recursive Transition Networks (RTNs), Enhanced Recursive Transition Networks (ERTNs), Regular Expressions (RegExs) and Context Free Grammars (CFGs). NooJ is particularly suitable for processing different types of MWEs, and several experiments have already been carried out in this area: for instance, Machonis (2007 and 2008), Anastasiadis, Papadopoulou & Gavriilidou (2011), Aoughlis (2011) and Vietri (2008). These are only a few examples of the various analyses of MWEs performed in the last few years using NooJ as an NLP development and testing environment.
    2. The Dictionary of English-Italian MWEs
    The EIMWE.dic is a dictionary used to represent and recognise various types of MWEs. It is based on a contrastive English-Italian analysis of continuous and discontinuous MWEs with different degrees of co-occurrence variability and compositionality, and with different syntactic structures. The translation of MWEs requires knowledge of the correct equivalent in the target language, which is hardly ever the result of a literal translation. Given their arbitrariness, MT has to rely on the availability of ready solutions in both languages in order to perform an accurate translation process. Each entry of the dictionary is given a coherent linguistic description consisting of:
    • the grammatical category of each constituent of the MWE: noun (N), verb (V), adjective (A), preposition (PREP), determiner (DET), adverb (ADV), conjunction (CONJ);
    • one or more inflectional and/or derivational paradigms (e.g. how to conjugate verbs, how to nominalise them), preceded by the tag +FLX;
    • one or more syntactic properties (e.g. “+transitive” or +N0VN1PREPN2);
    • one or more semantic properties (e.g. distributional classes such as “+Human”, domain classes such as “+Politics”);
    • the translation into Italian.
    The EIMWE.dic contains different types of MWE POS patterns. The main part of the dictionary consists of phrasal verbs, support verb constructions, idiomatic expressions and collocations. In the poster, the main verb structures are explained with examples extracted from the British National Corpus, from the Internet by means of the WebCorp LSE application, or with our own examples, together with the Italian translations. Finally, the corresponding dictionary entry for each example of MWE POS pattern is provided.
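The entry description above (a category per constituent, inflectional paradigms, syntactic and semantic properties, and an Italian translation) could be modelled roughly as follows. This is only a minimal Python sketch for illustration: it is not NooJ's dictionary format or API, the names `MweEntry` and `find_mwes` and all field names are hypothetical, and the sample entry is our own illustration rather than an entry taken from EIMWE.dic. The lookup handles only contiguous, uninflected matches; inflected and discontinuous MWEs are what the abstract assigns to paradigms and local grammars.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class MweEntry:
    """Hypothetical in-memory view of one dictionary entry (not NooJ's file format)."""
    lemma: str                     # English MWE in citation form
    pos_pattern: List[str]         # one category per constituent, e.g. ["V", "DET", "N"]
    flx: List[str] = field(default_factory=list)        # names of inflectional paradigms
    syntax: List[str] = field(default_factory=list)     # e.g. ["N0VN1PREPN2"]
    semantics: List[str] = field(default_factory=list)  # e.g. ["Human"]
    translation: str = ""          # Italian equivalent


def find_mwes(tokens: List[str], entries: List[MweEntry]) -> List[Tuple[int, int, MweEntry]]:
    """Return (start, end, entry) for each contiguous match, preferring the longest."""
    index = {tuple(e.lemma.lower().split()): e for e in entries}
    max_len = max((len(key) for key in index), default=0)
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in index:
                matches.append((i, i + n, index[key]))
                i += n
                break
        else:
            i += 1  # no entry starts at this token
    return matches


# Illustrative entry (our own example, not from EIMWE.dic).
entries = [MweEntry("kick the bucket", ["V", "DET", "N"], translation="tirare le cuoia")]
tokens = "He will kick the bucket soon".split()
for start, end, entry in find_mwes(tokens, entries):
    print(" ".join(tokens[start:end]), "->", entry.translation)
```

Matching the longest span first keeps a shorter entry from pre-empting a longer one that starts at the same token, which is one simple way to treat an MWE as a single unit of meaning.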

    About adequacy, equivalence and translatability in human and machine translation

    This paper examines the concepts of adequacy, equivalence and translatability in human translation, and in particular how the concept of adequacy evolves with respect to the evaluation of quality in Machine Translation (MT). The paper starts with an analysis of the notions of translated sense and adequacy as discussed in translation theory, and highlights how considerations on the nature of human translation lose their theoretical strength when applied to MT. The different ways of conceiving sense in human and machine translation, with regard to the concepts of adequacy and equivalence, lead to different interpretations of the relationship between source and target text.

    gENder-IT: An Annotated English-Italian Parallel Challenge Set for Cross-Linguistic Natural Gender Phenomena

    Languages differ in terms of the absence or presence of gender features, the number of gender classes, and whether and where gender features are explicitly marked. These cross-linguistic differences can lead to ambiguities that are difficult to resolve, especially for sentence-level MT systems. The identification of ambiguity and its subsequent resolution is a challenging task for which there are currently no specific resources or challenge sets available. In this paper, we introduce gENder-IT, an English-Italian challenge set focusing on the resolution of natural gender phenomena by providing word-level gender tags on the English source side and multiple gender alternative translations, where needed, on the Italian target side.
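To make the annotation scheme described above more concrete, the following Python sketch shows one way such an item might be held in memory: word-level gender tags on the English side and one or more Italian alternatives on the target side. Everything here is an assumption for illustration: the class `ChallengeItem`, its field names and the tag labels `"M"`, `"F"` and `"AMBIG"` are hypothetical and do not reproduce the actual file format or tag set of gENder-IT, and the sentence is our own example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ChallengeItem:
    """Hypothetical representation of one challenge-set item."""
    source_tokens: List[str]      # English source sentence, tokenised
    gender_tags: Dict[int, str]   # token index -> hypothetical tag: "M", "F" or "AMBIG"
    translations: List[str]       # one or more Italian gender alternatives


def is_ambiguous(item: ChallengeItem) -> bool:
    """True if any tagged source word leaves gender unresolved at sentence level."""
    return any(tag == "AMBIG" for tag in item.gender_tags.values())


# Illustrative item: "doctor" carries no gender cue in English,
# so both Italian alternatives are listed.
item = ChallengeItem(
    source_tokens=["The", "doctor", "arrived", "."],
    gender_tags={1: "AMBIG"},
    translations=["Il dottore è arrivato.", "La dottoressa è arrivata."],
)
print(is_ambiguous(item), len(item.translations))
```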

    A Digital Storytelling Laboratory to Foster Second Language Acquisition in Higher Education: Students’ Perspectives and Reflections

    Today’s technology-suffused society is inevitably transforming the learning process as the role of technology in our lives progressively increases. Teachers and educators in the 21st century therefore face the challenge of learning and understanding how best to integrate technology into the classroom while equipping students with the skills necessary to live and work in our digitized world. These skills, described by the Framework for 21st Century Learning, are in particular critical thinking, learning motivation, information literacy, media literacy and language competence, considered key competences for lifelong learning. In particular, as “The Council Recommendation on a comprehensive approach to the teaching and learning of languages” has recently stated, the lack of language competences is a barrier to increasing productivity and collaboration across borders. Consequently, to attain contemporary educational objectives, second language pedagogy needs to be complemented by the use of today’s digital tools, which should be considered not a replacement for traditional teaching methods but a powerful, active support in fostering Second Language Acquisition (SLA). Specifically, Digital Storytelling (DST) is progressively emerging as an innovative instructional tool to enhance SLA together with students’ motivation, collaboration, reflection and academic achievement. By combining traditional storytelling with digital multimedia, DST embodies the constructionist idea of learning by making, thus making students active participants in their learning process instead of passive agents as in face-to-face learning. Although various studies describe the use of DST in primary and secondary language education, to the best of our knowledge very few have been conducted on the use of DST in Higher Education, especially in Italy, where DST is a major innovation. For this reason, a Digital Storytelling Laboratory was set up at “L’Orientale” University of Naples, starting in March 2019. It was addressed to 24 Bachelor’s students in the second year of their course in English Language and Linguistics. The students were first introduced to the field, which was almost completely new to them, and then involved in a Digital Storytelling process that required the assimilation and completion of goal-oriented tasks, finally resulting in the production of a series of Digital Stories. This paper aims to explore the impact of DST on the academic development, learning motivation and collaboration of university students learning English as a second language. To that end, quantitative data were collected describing students’ perspectives and reflections on the effectiveness of DST in learning.

    September 11, 1986

    The Breeze is the student newspaper of James Madison University in Harrisonburg, Virginia